A Nonlinear Observation Model from Corrupted Speech Log Me

Authors

  • Jasha Droppo
  • Alex Acero
Abstract

In this paper we present a new statistical model that describes how additive noise corrupts the Mel-frequency spectral features used in speech recognition. The model explicitly represents the effect of unknown phase, together with the unobserved clean speech and noise, as three hidden variables. We use it to produce noise-robust features for automatic speech recognition. The model is constructed in the log Mel-frequency feature domain. This domain is linearly related to the MFCC recognition parameters, and it also offers low dimensionality and independence of the corruption across feature dimensions. We illustrate the surprising result that, even when the true noise Mel-frequency spectral feature is known, the traditional spectral subtraction formula is flawed. We show that the new model can be used to derive a spectral subtraction formula that produces superior error rates and is less sensitive to tuning parameters. Finally, we present results demonstrating that the new model is more general than spectral subtraction and can take advantage of a prior noise estimate to produce robust features, rather than relying on point estimates of noise.
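The role of the unknown phase can be illustrated with a small numerical sketch. This is not the authors' model: it assumes a single frequency bin and uses only the standard identity |X + N|^2 = |X|^2 + |N|^2 + 2|X||N|cos(theta), whereas the paper works with Mel-frequency features. The function names, the `alpha` symbol for the phase term, and the flooring constant are hypothetical choices made for illustration.

```python
import math


def noisy_power(clean_power, noise_power, phase):
    """Power of the sum of clean speech and noise in one bin.

    `phase` is the relative phase (radians) between the two components.
    """
    cross = 2.0 * math.sqrt(clean_power * noise_power) * math.cos(phase)
    return clean_power + noise_power + cross


def log_noisy(x, n, alpha):
    """Same relation in the log-power domain.

    x and n are log powers of clean speech and noise; alpha stands in
    for cos(phase) (an assumed name, not taken from the paper).
    """
    d = n - x
    return x + math.log(1.0 + math.exp(d) + 2.0 * alpha * math.exp(d / 2.0))


def spectral_subtraction(noisy, noise_power, floor=1e-3):
    """Traditional power-domain subtraction with a floor (assumed value)."""
    return max(noisy - noise_power, floor * noisy)


if __name__ == "__main__":
    clean, noise = 4.0, 1.0
    for phase in (0.0, math.pi / 2.0, math.pi):
        y = noisy_power(clean, noise, phase)
        estimate = spectral_subtraction(y, noise)
        print(f"phase={phase:.2f}  noisy={y:.3f}  estimate={estimate:.3f}  true={clean:.3f}")
```

In this sketch the subtraction recovers the clean power only when the cross term vanishes, even though the exact noise power is supplied, which is consistent with the abstract's remark about the traditional formula.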

Similar resources

Importance sampling to compute likelihoods of noise-corrupted speech

One way of making speech recognisers more robust to noise is model compensation. Rather than enhancing the incoming observations, model compensation techniques modify a recogniser’s state-conditional distributions so they model the speech in the target environment. Because the interaction between speech and noise is non-linear, even for Gaussian speech and noise the corrupted speech distributio...

Model composition by lagrange polynomial approximation for robust speech recognition in noisy environment

This paper presents a technique for estimating HMM model parameters for noisy speech from given clean speech HMM and noise HMM. The model parameters are estimated by approximating the non-linear function governing the relationship between speech and noise, by a Lagrange polynomial, and thus enabling the distribution of corrupted speech parameters to have a closed form. The method is computation...

Noise adaptive speech recognition based on sequential noise parameter estimation

In this paper, a noise adaptive speech recognition approach is proposed for recognizing speech which is corrupted by additive non-stationary background noise. The approach sequentially estimates noise parameters, through which a nonlinear parametric function adapts mean vectors of acoustic models. In the estimation process, posterior probability of state sequence given observation sequence and ...

Statistical Methods for Speech Transmission Using Hidden Markov Models

This work considers the problem of Bayesian estimation of a hidden Markov source corrupted by additive noise. We develop sequential and complete sequence Bayesian decoders for noisy sources with memory and apply them to the log-area ratio (LAR) coefficients of speech corrupted by additive white Gaussian noise. To this end, we follow a model-based approach in which the source is approximated by a...

A comparison of three non-linear observation models for noisy speech features

This paper reports our recent efforts to develop a unified, non-linear, stochastic model for estimating and removing the effects of additive noise on speech cepstra. The complete system consists of prior models for speech and noise, an observation model, and an inference algorithm. The observation model quantifies the relationship between clean speech, noise, and the noisy observation. Since it i...

Publication date: 2002